
    On transforming spectral peaks in voice conversion

    International audience. This paper explores the benefits of transforming spectral peaks in voice conversion. First, in examining classic GMM-based transformation with cepstral coefficients, we show that the lack of transformed data variance ("over-smoothing") can be related to the choice of spectral parameterization. Consequently, we propose an alternative parameterization using spectral peaks. The peaks are transformed using HMMs with Gaussian state distributions. Two learning variants and post-processing treating peak evolution in time are also examined. In comparing the different transformation approaches, spectral peaks are shown to offer higher inter-speaker feature correlation and yield higher transformed data variance than their cepstral coefficient counterparts.

    On the use of spectral peak parameters in voice conversion

    International audience. This paper addresses the problem of low transformed data variance, or "over-smoothing," in spectral transformation for Voice Conversion. In examining a classic GMM-based transformation with cepstral coefficients, we show that this problem lies not only in the transformation model (as commonly assumed), but also in the choice of spectral parameterization. Consequently, we propose an alternative method for spectral transformation using spectral peaks and an HMM with Gaussian state distributions. The spectral peaks are shown to offer higher inter-speaker feature correlation and yield higher transformed data variance than their cepstral coefficient counterparts. Additionally, the accuracy of the transformed envelopes is examined.

    Alleviating the one-to-many mapping problem in voice conversion with context-dependent modelling

    International audience. This paper addresses the "one-to-many" mapping problem in Voice Conversion (VC) by exploring source-to-target mappings in GMM-based spectral transformation. Specifically, we examine differences using source-only versus joint source/target information in the classification stage of transformation, effectively illustrating a "one-to-many effect" in the traditional acoustically-based GMM. We propose combating this effect by using phonetic information in the GMM learning and classification. We then show the success of our proposed context-dependent modeling with transformation results using an objective error criterion. Finally, we discuss implications of our work in adapting current approaches to VC.

    Estimation d'enveloppes spectrales contraintes temporellement pour la conversion de voix (Estimation of temporally constrained spectral envelopes for voice conversion)

    National audience. This paper presents a new approach to estimating the speech spectral envelope that is adapted for Voice Conversion (VC). In particular, we represent the spectral envelope as a sum of peaks that evolve smoothly in time, within a phoneme. We highlight important properties of our proposed spectral envelope estimation and illustrate its potential for use in a VC context. We analyse natural speech using the proposed methods and compare the results with those from a more traditional frame-by-frame cepstrum-based analysis. Subjective comparisons of synthesized speech quality, as well as implications of this work for future research, are also discussed.

    Estimation du signal glottique basée sur un modèle ARX (Glottal signal estimation based on an ARX model)

    The aim of this article is to estimate the glottal source signal from the speech signal alone. Using the ARX model of speech production together with a glottal source model turns this deconvolution problem into a nonlinear optimization problem. We present an efficient method for solving this problem, along with results on synthetic and real signals.

    Influence de la modélisation spectrale sur les performances d'un système de conversion de voix (Influence of spectral modeling on the performance of a voice conversion system)

    Voice conversion is a technique that modifies the speech signal of a reference speaker, also called the source speaker, so that it sounds to the listener as if it were spoken by the desired speaker. In this paper, we study the influence of spectral modeling on the quality of timbre conversion. Within the framework of GMM-based conversion, we compare modeling by discrete cepstrum and by LSF parameters. Objective tests show that using LSF parameters leads to better conversion results.

    Speech Technologies for African Languages: Example of a Multilingual Calculator for Education

    International audience. This paper presents our achievements after 18 months of the ALFFA project, which deals with African language technologies. We focus on a multilingual calculator (Android app) that will be demonstrated during the Show and Tell session.

    Speech spectral envelope estimation through explicit control of peak evolution in time

    International audience. This work proposes a new approach to estimating the speech spectral envelope that is adapted for applications requiring time-varying spectral modifications, such as Voice Conversion. In particular, we represent the spectral envelope as a sum of peaks that evolve smoothly in time, within a phoneme. Our representation provides a flexible model for the spectral envelope that is relevant to human speech production and perception. We highlight important properties of the proposed spectral envelope estimation, as applied to natural speech, and compare results with those from a more traditional frame-by-frame cepstrum-based analysis. Subjective evaluations and comparisons of synthesized speech quality, as well as implications of this work for future research, are also discussed.